Combining Dictionary-Based and Example-Based Methods for Natural Language Analysis
نویسندگان
چکیده
We propose combining dictionary-based and example-based natural language (NL) processing techniques in a framework that we believe will provide substantive enhancements to NL analysis systems. The centerpiece of this framework is a relatively large-scale lexical knowledge base that we have constructed automatically from an online version of Longman's Dictionary of Contemporary English (LDOCE), and that is currently used in our NL analysis system to direct phrasal attachments. After discussing the effective use of example-based processing in hybrid NL systems, we compare recent dictionarybased and example-based work, and identify the aspects of this work that are included in the proposed framework. We then describe the methods employed in automatically creating our lexical knowledge base from LDOCE, and its current and planned use as a large-scale example base in our NL analysis system. This knowledge base is structured as a highly interconnected network of words linked by semantic relations such as is_a, has_part, location_of, typical_object, and is_for. We claim that within the proposed hybrid framework, it provides a uniquely rich source of information for use during NL analysis.
منابع مشابه
A Mono-lingual Corpus-Based Machine Translation of the Interlingua Method
This paper describes a prototype of an example-based machine translation system. In this system, key language resources are EDR corpus and concept classification dictionary. The corpus consists of a pair of sentences, their morphological representations, their syntactic representations, and their semantic representations. The semantic representations are described by an interlingua. Therefore t...
متن کاملDictionary of Abstract and Concrete Words of the Russian Language: A Methodology for Creation and Application
The paper describes the first stage of a project on creating an electronic dictionary with numerical estimates of the degree of abstractness and concreteness of Russian words. Our approach is to integrate data obtained from several different sources: text corpora, psycholinguistic experiments, published dictionaries, markers of abstractness (certain suffixes) and a translation of a similar dict...
متن کاملSpeech Enhancement using Adaptive Data-Based Dictionary Learning
In this paper, a speech enhancement method based on sparse representation of data frames has been presented. Speech enhancement is one of the most applicable areas in different signal processing fields. The objective of a speech enhancement system is improvement of either intelligibility or quality of the speech signals. This process is carried out using the speech signal processing techniques ...
متن کاملA Supervised Method for Constructing Sentiment Lexicon in Persian Language
Due to the increasing growth of digital content on the internet and social media, sentiment analysis problem is one of the emerging fields. This problem deals with information extraction and knowledge discovery from textual data using natural language processing has attracted the attention of many researchers. Construction of sentiment lexicon as a valuable language resource is a one of the imp...
متن کاملAn Optimal Approach to Local and Global Text Coherence Evaluation Combining Entity-based, Graph-based and Entropy-based Approaches
Text coherence evaluation becomes a vital and lovely task in Natural Language Processing subfields, such as text summarization, question answering, text generation and machine translation. Existing methods like entity-based and graph-based models are engaging with nouns and noun phrases change role in sequential sentences within short part of a text. They even have limitations in global coheren...
متن کامل